Measuring Diversity in Regression Ensembles
Abstract
The problem of combining predictors to increase accuracy (often called ensemble learning) has been studied broadly in the machine learning community for both classification and regression tasks. The design of an ensemble depends on the individual accuracy of its predictors and on how different they are from one another. There is a significant body of literature on how to design and measure diversity in classification ensembles. Most of these metrics are not directly applicable to regression ensembles, since regression inherently deals with continuous-valued labels. For regression ensembles, Krogh and Vedelsby show that the quadratic error of an ensemble estimator is guaranteed to be less than or equal to the average quadratic error of its components. However, this result does not by itself give a way to measure or create diverse regression ensembles. This paper presents metrics (correlation coefficient, covariance, dissimilarity measure, chi-square, and mutual information) that can be used to measure diversity in regression ensembles. Careful selection of diverse models can reduce the overall ensemble size without substantial loss in performance. We present extensive empirical results showing the performance of diverse regression ensembles formed by Bagging and Random Forest techniques.
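The Krogh and Vedelsby bound mentioned in the abstract follows from their ambiguity decomposition: for a convex combination of members, the ensemble's squared error equals the weighted average member error minus the weighted average spread of the members around the ensemble. The following is a minimal sketch of that identity, together with one of the listed diversity metrics (the correlation coefficient). The function names, the synthetic data, and the choice to correlate member errors rather than raw predictions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def ambiguity_decomposition(preds, y, weights=None):
    """Return (ensemble_error, average_member_error, average_ambiguity).

    preds   : (M, N) array, predictions of M members on N samples
    y       : (N,) array of targets
    weights : (M,) non-negative weights summing to 1 (uniform if None)
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    m = preds.shape[0]
    w = np.full(m, 1.0 / m) if weights is None else np.asarray(weights, dtype=float)
    ens = w @ preds                                   # ensemble prediction, shape (N,)
    ens_err = np.mean((ens - y) ** 2)                 # squared error of the ensemble
    avg_err = np.mean(w @ (preds - y) ** 2)           # weighted mean member error
    ambiguity = np.mean(w @ (preds - ens) ** 2)       # weighted mean spread around the ensemble
    return ens_err, avg_err, ambiguity

def error_correlation(preds, y):
    """Pairwise Pearson correlation of member errors, shape (M, M).

    Assumption: diversity is measured on errors; lower off-diagonal
    values indicate more diverse members.
    """
    return np.corrcoef(np.asarray(preds, dtype=float) - np.asarray(y, dtype=float))

# Hypothetical data: three noisy predictors of a sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x)
preds = y + rng.normal(scale=0.3, size=(3, x.size))

ens_err, avg_err, ambiguity = ambiguity_decomposition(preds, y)
corr = error_correlation(preds, y)
print(ens_err, avg_err - ambiguity)   # the two values agree up to rounding
print(corr)
```

Because the ambiguity term is non-negative, the ensemble error can never exceed the average member error. The decomposition alone does not say which members to keep, which is where pairwise metrics such as the correlation matrix above come in.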
Similar resources
Tuning diversity in bagged neural network ensembles
In this paper we address the issue of how to optimize the generalization performance of bagged neural network ensembles. We investigate how diversity amongst networks in bagged ensembles can significantly influence ensemble generalization performance and propose a new early-stopping technique that effectively tunes this diversity so that overall ensemble generalization performance is optimized. Ex...
Managing Diversity in Regression Ensembles
Ensembles are a widely used and effective technique in machine learning—their success is commonly attributed to the degree of disagreement, or ‘diversity’, within the ensemble. For ensembles where the individual estimators output crisp class labels, this ‘diversity’ is not well understood and remains an open research issue. For ensembles of regression estimators, the diversity can be exactly fo...
Ensemble Learning with Local Diversity
The concept of Diversity is now recognized as a key characteristic of successful ensembles of predictors. In this paper we investigate an algorithm to generate diversity locally in regression ensembles of neural networks, which is based on the idea of imposing a neighborhood relation over the set of learners. In this algorithm each predictor iteratively improves its state considering only infor...
Diversity in neural network ensembles
We study the issue of error diversity in ensembles of neural networks. In ensembles of regression estimators, the measurement of diversity can be formalised as the Bias-Variance-Covariance decomposition. In ensembles of classifiers, there is no neat theory in the literature to date. Our objective is to understand how to precisely define, measure, and create diverse errors for both cases. As a fo...
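For reference, the Bias-Variance-Covariance decomposition named in this abstract is usually stated as follows for a uniformly averaged ensemble. The notation is an assumption introduced here, not taken from the truncated abstract: $M$ members $f_i$, ensemble $\bar{f} = \frac{1}{M}\sum_i f_i$, a noise-free target $t$, and expectations taken over training sets.

```latex
\mathbb{E}\!\left[(\bar{f} - t)^2\right]
  = \overline{\mathrm{bias}}^{\,2}
  + \frac{1}{M}\,\overline{\mathrm{var}}
  + \left(1 - \frac{1}{M}\right)\overline{\mathrm{covar}},
\qquad\text{where}
\begin{aligned}
\overline{\mathrm{bias}}  &= \frac{1}{M}\sum_{i=1}^{M}\bigl(\mathbb{E}[f_i] - t\bigr),\\
\overline{\mathrm{var}}   &= \frac{1}{M}\sum_{i=1}^{M}\mathbb{E}\!\left[(f_i - \mathbb{E}[f_i])^2\right],\\
\overline{\mathrm{covar}} &= \frac{1}{M(M-1)}\sum_{i=1}^{M}\sum_{j \neq i}
  \mathbb{E}\!\left[(f_i - \mathbb{E}[f_i])(f_j - \mathbb{E}[f_j])\right].
\end{aligned}
```

The covariance term is the only one that can be negative, so lowering the average covariance between members reduces the ensemble error as long as bias and variance are held fixed.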
Diversity and degrees of freedom in regression ensembles
Ensemble methods are a cornerstone of modern machine learning. The performance of an ensemble depends crucially upon the level of diversity between its constituent learners. This paper establishes a connection between diversity and degrees of freedom (i.e. the capacity of the model), showing that diversity may be viewed as a form of inverse regularisation. This is achieved by focusing on a prev...